DATA RETRIEVAL SYSTEM

The invention relates to a data retrieval system, in particular a
system for retrieving hierarchically related objects from a relational
database.



This invention is described in relation to an LDAP (Lightweight
Directory Access Protocol) directory server using a relational database
as a backing store. This is a particularly appropriate example of
hierarchical information which is nonetheless stored in a relational
database. It will be seen from the following description, however, that
the invention is not restricted to LDAP and is of use wherever
hierarchically related objects (data) have to be stored and retrieved
from a relational database.

An LDAP directory of the known type comprises a collection of
hierarchically related objects, an example of which is shown in Figure 1.
The structure of the directory and content of its objects are typically
determined by the contents of a schema object which is normally itself
stored in the directory. The contents of this schema object comprise a
set of object class definitions and a set of structural rules, as shown
for the above example in Figure 2. The class definitions include a) a
list of both mandatory (m) and optional (o) attributes for each object
class allowed in the directory; and b) a list defining the hierarchical
relationships between object classes and hence the inheritance rules for
class definitions. In the above example, all object classes other than
top are subclasses of the class top, thus inheriting the attribute
objectClass.

The structural rules control the arrangement of objects in the
directory hierarchy and comprise a list of the allowed child object
classes for each parent class and, for each such child class, the
attributes that may be used to provide a relative distinguished name
(RDN) for such an object.

The relative distinguished name (RDN) provides a unique name for an
object at that point in the directory hierarchy. The RDN is somewhat
unpredictable for any object, as it is formed by a combination of one or
more of the object's attributes and, as can be seen from the naming
attributes for employees, many different attributes may be used for an
object at any one point in the tree. LDAP objects also have a unique name
in the directory - the distinguished name (DN). The DN is formed by the
successive, sequential concatenation of the RDNs of the object itself and
its parents, back up to the root of the directory tree.

Thus, even in the simple case of Figure 1, the RDN for John Doe may
be init=JDD+ID=005047, and the DN may be orgName=IBM, siteName=Arizona,
init=JDD+ID=005047; while the RDN for Jane Deer may in fact be
emplName=Jane Deer and the DN may be orgName=IBM, siteName=Arizona,
emplName=Jane Deer.
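
The naming scheme just described can be sketched in a few lines of
Python. This is only an illustration: the helper names build_dn and
multi_valued_rdn are hypothetical, and the root-first ordering and the
", " and "+" separators are simply copied from the example names above.

```python
def multi_valued_rdn(pairs):
    """Form an RDN from one or more (attribute, value) pairs joined by '+'."""
    return "+".join(f"{attr}={value}" for attr, value in pairs)


def build_dn(rdns_root_to_leaf):
    """Concatenate RDNs into a DN, root first, as written in the examples above."""
    return ", ".join(rdns_root_to_leaf)


# John Doe is named by two attributes, Jane Deer by a single one.
john_rdn = multi_valued_rdn([("init", "JDD"), ("ID", "005047")])
jane_rdn = multi_valued_rdn([("emplName", "Jane Deer")])

john_dn = build_dn(["orgName=IBM", "siteName=Arizona", john_rdn])
jane_dn = build_dn(["orgName=IBM", "siteName=Arizona", jane_rdn])
```

The two employees sit at the same point in the tree yet are named by
different attributes, which is why the DN makes an awkward storage key.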

The nature of the LDAP data model means that hierarchies may be 
varied and complex; similarly the naming scheme for objects as 
exemplified above also permits substantial variability. As a result, the 
schema definition itself for a directory does not provide a mechanism 
that can be easily adapted for storage and access of the directory 
contents. Consequently a data retrieval system is needed whereby objects 
can be assigned to a store with indexing to permit subsequent efficient 
search and retrieval. 

Accordingly, the present invention provides a data retrieval system
as claimed in claim 1.

Embodiments of the invention will now be described with reference 
to the accompanying drawings, in which: 

Figure 1 is a diagram illustrating a conventional LDAP directory;

Figure 2 is a conventional schema for the LDAP directory of Figure 1;

Figure 3 illustrates some hierarchical data;

Figures 4(a) & 4(b) are conventional index and data tables for the
hierarchical data of Figure 3;

Figures 5(a) & 5(b) are index and data tables for use in a data
retrieval system according to a first embodiment of the invention; and

Figures 6(a) & 6(b) are index and data tables for use in a data
retrieval system according to a second embodiment of the invention.



The most frequently used search and retrieval operations in the 
LDAP protocol, and for most hierarchical databases, are: 

1. Retrieval of the object from the specification of its DN. Using the
data of Figure 1, for example, retrieve the object orgName=Microsoft,
prodName=Windows 3.1.

2. Search of the directory tree, from a parent object specified by DN,
where a value for an attribute is specified, so as to retrieve a subset
of its immediate children. For example, again using Figure 1, search from
orgName=IBM, siteName=Arizona for emplName=John.

3. As above, but to retrieve an object and any descendant of the
object rather than be restricted to immediate children. For example,
retrieve orgName=IBM and all its descendant objects. This may be combined
with further search criteria, for example, emplName=John Doe.

The precise details of the mapping of directory objects into tables
are not part of this disclosure. Although in practice the objects will be
mapped into several tables, it is simpler here to assume that all the
objects are assigned to a single table. It is necessary, however, that
each object is assigned a unique identifier. Although it is permissible
to use the DN for this purpose, the DN is generally long and of varying
length and therefore somewhat unsuitable. Thus, in the present
embodiments, a unique numeric identifier is assigned to each object.

Taking the data of Figure 3 for example, a conventional storage
scheme which might normally be adopted is shown in Figures 4(a) and 4(b).
Two tables are involved: an index table, Figure 4(a), which maps between
each object DN (A, B, M, ... Y) and unique identifier (1 ... 7); and a
data table which holds a row for each object. The data table also holds
the unique identifier of the object's parent. The data table has columns
for the various object attributes, including one for the unique object
identifier. In summary:

Index table (row for each object):
    Object name (DN)
    Object identifier

Data table (row for each object):
    Object attributes (one column for each)
    Object identifier
    Parent object identifier

Variations on this general theme may be used, e.g. the parent
identifier may be stored in the index table rather than in the data
table. However, all similar schemes have the characteristic that they
support direct retrieval of an object whose name is known, but do not
provide for efficient search and/or retrieval of descendants as described
in point 3 above. For example, if the objects for the database of Figure
1 are not stored in order in tables of the type shown in Figure 4, then a
search for the descendants of IBM could only retrieve descendants of the
Arizona object stored sequentially after the location of the Arizona
object. The search engine would then need to traverse the table again to
find all children of the Arizona object stored before the Arizona object
in the table, and so on for these children.

A retrieval system operating on such an index and data table must 
effectively navigate the hierarchy, thereby resulting in many sequential 
operations and traverses of the data table. If the tables are implemented 
in a relational database this prohibits the use of the relational 
operators to conduct a full descendants search in a single traverse of 
the table. 
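
The cost of the conventional scheme can be illustrated with a small
sketch using Python's sqlite3 module. The six-object hierarchy, the table
name data and the column names are assumptions made for this illustration
only; they are not the contents of Figures 3 or 4. The point is that,
with only a parent identifier to go on, the search issues one select per
level of the tree.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (id INTEGER PRIMARY KEY, parent_id INTEGER, name TEXT)")
# Hypothetical hierarchy: A is the root; B and M are children of A;
# N and X are children of M; Y is a child of B.
conn.executemany("INSERT INTO data VALUES (?, ?, ?)", [
    (1, None, "A"), (2, 1, "B"), (3, 1, "M"),
    (4, 3, "N"), (5, 3, "X"), (6, 2, "Y"),
])


def descendants_by_parent_id(conn, root_id):
    """Walk the hierarchy level by level; return (ids, number of selects)."""
    found, frontier, selects = [], [root_id], 0
    while frontier:
        marks = ",".join("?" * len(frontier))
        cur = conn.execute(
            f"SELECT id FROM data WHERE parent_id IN ({marks})", frontier)
        selects += 1
        frontier = [row[0] for row in cur]
        found.extend(frontier)
    return sorted(found), selects


ids, selects = descendants_by_parent_id(conn, 1)
```

In this sketch the number of selects grows with the depth of the tree,
because each level must be fetched before the next can be requested.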

The data retrieval system according to the invention overcomes 
these limitations. 

For each object, its position in the hierarchy can be described by
the sequential concatenation of its own identifier, which is unique in
the directory, with those of its parents, taken in sequence. This
collection of values will be termed the position key. Note that in the
LDAP case, the position key would generally not be the same as the DN,
which is formed by the concatenation of identifiers that are unique only
at that branch of the tree.
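
As a rough sketch, a position key can be derived from a map of object
identifier to parent identifier. The function name position_key, the
sample parent_of map, the root-first ordering and the padding with None
up to a fixed depth are all assumptions for illustration; they are chosen
so that the key lines up with one column per level of the hierarchy.

```python
def position_key(obj_id, parent_of, depth):
    """Return the identifiers from the root down to obj_id, padded with None."""
    chain = []
    current = obj_id
    while current is not None:
        chain.append(current)
        current = parent_of[current]
    chain.reverse()  # root first, the object itself last
    return tuple(chain + [None] * (depth - len(chain)))


# Hypothetical hierarchy: 1 is the root; 2 and 3 are its children;
# 4 and 5 are children of 3; 6 is a child of 2.
parent_of = {1: None, 2: 1, 3: 1, 4: 3, 5: 3, 6: 2}
keys = {obj_id: position_key(obj_id, parent_of, 3) for obj_id in parent_of}
```

Because every component is a directory-wide unique identifier, any
single component is enough to identify a whole subtree.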

In a first embodiment of the invention, Figures 5(a) and 5(b), the
position key (pk1, pk2, pk3) for an object is preferably stored as a set
of values in a suitable number of columns assigned in both the index
and data tables (one column being required for each level in the
directory tree hierarchy). The advantage of the position key is that the
object data is now very easily searched on a hierarchical basis; thus if
any descendants of a particular object are required, the position key of
the parent can be used in a search condition on an object table (see
example below). Moreover the key can also be used to control the level of
searching; if only the immediate descendants are required then the search
above can be further restricted by requiring that all other columns
assigned to the position key, save that for the level of the immediate
descendants, be null. Variations on this theme are possible.

Using the tables of Figures 5(a) and 5(b):

1. A is found by a select on the data table where: identifier=1.

2. The immediate children of A are found by a select on the data table
where: pk1=1 & pk2!=null & pk3=null.

where != designates the boolean operator "not equal".
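
The two selects of the first embodiment might look as follows in SQL,
here run through Python's sqlite3 module. The table layout and the six
sample rows are assumptions for illustration and do not reproduce Figure
5(b). Note that in standard SQL the conditions written above as !=null
and =null have to be expressed as IS NOT NULL and IS NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE data (
    name TEXT, id INTEGER PRIMARY KEY, parent_id INTEGER,
    pk1 INTEGER, pk2 INTEGER, pk3 INTEGER)""")
# Hypothetical rows: (name, id, parent id, position key columns pk1..pk3).
conn.executemany("INSERT INTO data VALUES (?, ?, ?, ?, ?, ?)", [
    ("A", 1, None, 1, None, None),
    ("B", 2, 1, 1, 2, None),
    ("M", 3, 1, 1, 3, None),
    ("N", 4, 3, 1, 3, 4),
    ("X", 5, 3, 1, 3, 5),
    ("Y", 6, 2, 1, 2, 6),
])


def names(sql):
    """Run a select on the data table and return the names, ordered by id."""
    return [row[0] for row in conn.execute(sql + " ORDER BY id")]


# 1. Direct retrieval of A by its identifier.
a = names("SELECT name FROM data WHERE id = 1")

# 2. Immediate children of A: pk1=1 & pk2!=null & pk3=null.
children_of_a = names(
    "SELECT name FROM data WHERE pk1 = 1 AND pk2 IS NOT NULL AND pk3 IS NULL")
```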

A second embodiment, Figures 6(a) & 6(b), overcomes the problem of
the first embodiment taking significant storage in the index table. It
will be seen that it is not necessary to use all the columns of the
position key to achieve the desired effect. Only the lowest column, i.e.
that non-null column furthest from the root, is significant in
identifying the object. The value in this column is the unique object
identifier itself. Accordingly, in the index table, Figure 6(a), instead
of adding the position key to the contents previously described it is
sufficient to add a column, entitled level, containing the level of that
object in the hierarchy (this indicates the position key column relevant
to the unique identifier).

In the data table, Figure 6(b), given the information content of
the position key, the two columns previously described that contain the
unique object and parent identifiers are no longer needed.
The tables are now:

Index table (row for each object):
    Object name
    Object identifier
    Object level in directory tree

Data table (row for each object):
    Object attributes (one column for each)
    Position key identifier (one column for each level in the
    hierarchy)

Using the tables of Figures 6(a) & 6(b): 

1. A is found by a select on the index table where: identifier=1.

2. The immediate children of A are found by a select on the data table
where: pk1=1 & pk2!=null & pk3=null.

3. All descendants of A are found by a select on the data table where:
pk1=1 & pk2!=null.

4. The children of M are found by a select on the data table where:
pk2=3 & pk3!=null.
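
The selects listed above can be generated from the level column of the
index table. The following sketch, again using Python's sqlite3 module,
is one possible reading under stated assumptions: the table names idx and
data, the function name search, the fixed depth of three levels and the
sample rows are all hypothetical and do not reproduce Figures 6(a) and
6(b). The object's own identifier is read from the lowest non-null
position key column.

```python
import sqlite3

DEPTH = 3  # assumed number of levels, hence of position key columns

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE idx (name TEXT, id INTEGER PRIMARY KEY, level INTEGER)")
conn.execute("CREATE TABLE data (attr TEXT, pk1 INTEGER, pk2 INTEGER, pk3 INTEGER)")
sample = [("A", 1, (1, None, None)), ("B", 2, (1, 2, None)), ("M", 3, (1, 3, None)),
          ("N", 4, (1, 3, 4)), ("X", 5, (1, 3, 5)), ("Y", 6, (1, 2, 6))]
for name, ident, key in sample:
    level = sum(k is not None for k in key)  # count of non-null key columns
    conn.execute("INSERT INTO idx VALUES (?, ?, ?)", (name, ident, level))
    conn.execute("INSERT INTO data VALUES (?, ?, ?, ?)", (name + " attributes", *key))


def search(ident, immediate_only):
    """Return identifiers of the children, or of all descendants, of ident."""
    (level,) = conn.execute("SELECT level FROM idx WHERE id = ?", (ident,)).fetchone()
    if level >= DEPTH:
        return []  # an object at the deepest level has no descendants
    where = [f"pk{level} = ?", f"pk{level + 1} IS NOT NULL"]
    if immediate_only and level + 2 <= DEPTH:
        where.append(f"pk{level + 2} IS NULL")
    # The lowest non-null key column holds the object's own identifier.
    own_id = "COALESCE(" + ", ".join(f"pk{i}" for i in range(DEPTH, 0, -1)) + ")"
    sql = f"SELECT {own_id} FROM data WHERE {' AND '.join(where)} ORDER BY 1"
    return [row[0] for row in conn.execute(sql, (ident,))]


children_of_a = search(1, immediate_only=True)
descendants_of_a = search(1, immediate_only=False)
children_of_m = search(3, immediate_only=True)
```

Each search costs one lookup in the index table and one select on the
data table, whatever the depth of the subtree.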



Thus, in comparison with the conventional tables of Figure 4, the
index table of the second embodiment contains one extra column, while the
data table of the second embodiment contains a number of extra columns
equal to two less than the number of levels in the hierarchy. However,
search operations are now reduced to a simple traverse of the data table,
instead of the complex sequence of selects previously required.
CLAIMS

1. A data retrieval system in which a plurality of objects having a
multi-level hierarchical relationship are stored, each object having a
respective parent and a set of children, said system including an index
table comprising a respective name and associated identifier for each
object, and a data table comprising a respective set of attributes and a
position key associated with each object in the system, each position key
comprising a series of components, each component corresponding to a
level of the hierarchy, a first component of said key storing the
identifier of an associated object, and each successive component storing
the identifier of the parent of the object stored in the previous
component.



2. A data retrieval system as claimed in claim 1 in which said index
table includes an attribute storing the respective level of each object
in the system.

3. A data retrieval system as claimed in claim 1 in which said index
table comprises a position key associated with each object in the system
and said data table includes a respective identifier associated with each
object in the system and a respective identifier of the parent of each
object in the system.

4. A method of retrieving an object name for an object stored in the
data retrieval system claimed in claim 1 or 2, comprising the steps of:

a. specifying an identifier of the object; and

b. retrieving from the index table the name for the object whose
identifier matches said specified identifier.

5. A method of retrieving the immediate children of an object stored
in the data retrieval system claimed in claim 2, comprising the steps of:

a. specifying an identifier of the object;

b. searching the index table to ascertain the hierarchical level of
said object;

c. selecting from the data table objects where the contents of the
position key component for said level matches the identifier of the
object;

d. for said objects where the contents of the position key
component for the hierarchical level below said level are not null
and where the contents of the position key component for the next
hierarchical level below said level are null, retrieving the
respective object identifiers.

6. A method of retrieving all descendants of an object stored in a 
data retrieval system claimed in claim 2, comprising the steps of: 

a. specifying an identifier of the object; 

b. searching the index table to ascertain the hierarchical level of 
said object; 

c. selecting from the data table objects where the contents of the 
position key component for said level matches the identity of the 
object;

d. for said objects where the contents of the position key 
component for the level below said level are not null, retrieving 
the respective object identifiers. 

7. A method of retrieving an object name for an object stored in the 
data retrieval system claimed in claim 3, comprising the steps of: 

a. specifying an identifier of the object; and 

b. retrieving from the data table the name for the object whose 
identifier matches said identifier. 



8. A method of retrieving the immediate children of an object stored 
in the data retrieval system claimed in claim 3 comprising the steps of: 

a. specifying an identifier of the object; and 

b. for objects where the contents of the parent identifier match 
the identifier of the object, retrieving the respective object 
identifiers.



9. A method of retrieving all descendants of an object stored in a 
data retrieval system claimed in claim 3, comprising the steps of: 

a. specifying an identifier of the object; 

b. ascertaining the level of the object according to when a first 
component of said position key matches the identifier of said 
object;

c. selecting from the data table objects where the contents of the 
position key component for said level matches the identity of the 
object; and 

d. for said objects where the contents of the position key 
component for the level below said level are not null, retrieving 
the respective object identifiers. 



10. A method as claimed in claims 5, 6 or 9 where steps c) and d) are 
carried out in a single traverse of said data table. 

11. A method as claimed in claims 5, 6, 8, 9 or 10 where the retrieved 
object identifiers are stored in the first component of respective 
position keys. 